Aspects concerning on the SVM Method’s Scalability

Authors

  • I. D. MORARIU
  • M. VINTAN
  • L. N. VINTAN
Abstract

In recent years the quantity of text documents has been increasing continually, and automatic document classification has become an important challenge. In text document classification, the training step is essential to obtaining a good classifier. The quality of learning depends on the dimension of the training data. When working with huge learning data sets, problems arise because the training time increases exponentially. In this paper we present a method that allows working with huge data sets in the training step without exponentially increasing the training time and without significantly decreasing the classification accuracy.

Similar resources

Unbalanced and Partial L1 Monge–Kantorovich Problem: A Scalable Parallel First-Order Method

We propose a new algorithm to solve the unbalanced and partial L1 Monge–Kantorovich problems. The proposed method is a first-order primal-dual method that is scalable and parallel. The method’s iterations are conceptually simple, computationally cheap, and easy to parallelize. We provide several numerical examples solved on a CUDA GPU, which demonstrate the method’s practical effectiveness.

Biomedical Named Entity Recognition Using Support Vector Machines: Performance vs. Scalability Issues

This paper examines the performance and scalability of Named Entity Recognition (NER) using multi-class Support Vector Machines (SVM) and high-dimensional features. The NER domain chosen for these experiments is the biomedical publications domain, especially selected due to its importance and inherent challenges. We use a simple machine learning approach that eliminates prior language knowledge...

Using machine learning methods to predict experimental high-throughput screening data

High-throughput screening (HTS) remains a very costly process notwithstanding many recent technological advances in the field of biotechnology. In this study we consider the application of machine learning methods for predicting experimental HTS measurements. Such a virtual HTS analysis can be based on the results of real HTS campaigns carried out with similar compounds libraries and similar dr...

Time Properties Dedicated Semantics for UML-MARTE Safety Critical Real-Time System Verification

Critical real-time embedded systems (RTES) have strong requirements concerning system reliability. UML and its profile MARTE are standardized modeling languages that are becoming widely accepted by industrial designers to cope with the development of complex RTES. In model-driven engineering, verification at early phases of the system lifecycle is an important problem, which remains op...

Method for Aspect-Based Sentiment Annotation Using Rhetorical Analysis

This paper fills a gap in aspect-based sentiment analysis and aims to present a new method for preparing and analysing texts concerning opinion and generating user-friendly descriptive reports in natural language. We present a comprehensive set of techniques derived from Rhetorical Structure Theory and sentiment analysis to extract aspects from textual opinions and then build an abstractive sum...

Publication date: 2007